Basic Statistics

Raw Counts

Name                    Value
Rows                    41,816
Columns                 30
Discrete columns        26
Continuous columns      4
All missing columns     0
Missing observations    567,783
Complete Rows           3,072
Total observations      1,254,480
Memory allocation       21.4 Mb

Percentages
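
The percentages for this section can be recovered from the raw counts table. The following is a minimal sketch in Python, not the code that produced this report. The variable names and the `pct` helper are illustrative assumptions, and the denominators (columns for column counts, total observations for missing observations, rows for complete rows) are the natural choices rather than something the report states.

```python
# Raw counts copied from the "Raw Counts" table.
rows = 41_816
columns = 30
discrete_columns = 26
continuous_columns = 4
all_missing_columns = 0
missing_observations = 567_783
complete_rows = 3_072

# Total observations is rows times columns, which matches the table (1,254,480).
total_observations = rows * columns


def pct(part, whole):
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


# Assumed denominators: columns, total observations, and rows respectively.
percentages = {
    "Discrete columns": pct(discrete_columns, columns),
    "Continuous columns": pct(continuous_columns, columns),
    "All missing columns": pct(all_missing_columns, columns),
    "Missing observations": pct(missing_observations, total_observations),
    "Complete rows": pct(complete_rows, rows),
}

for name, value in percentages.items():
    print(f"{name}: {value:.2f}%")
```

Under these assumptions, roughly 45% of all observations are missing and only about 7% of rows are complete.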

Data Structure

Missing Data Profile

Univariate Distribution

Histogram

Bar Chart (by frequency)

## 15 columns ignored with more than 50 categories.
## dateAwarded: 177 categories
## honourDate: 195 categories
## pageCreation: 2691 categories
## fullName: 41462 categories
## name: 4368 categories
## name2: 4339 categories
## personLabel: 4592 categories
## personDescription: 2111 categories
## wikidataID: 4368 categories
## suburb: 8219 categories
## postcode: 2591 categories
## wikipediaURL: 4368 categories
## honoursURL: 4274 categories
## citation: 34501 categories
## announcement: 106 categories

QQ Plot

## Warning: Removed 2688 rows containing non-finite values (stat_qq).
## Warning: Removed 2688 rows containing non-finite values (stat_qq_line).

Correlation Analysis

## 15 features with more than 20 categories ignored!
## dateAwarded: 109 categories
## honourDate: 98 categories
## pageCreation: 2043 categories
## fullName: 3057 categories
## name: 2983 categories
## name2: 2970 categories
## personLabel: 2970 categories
## personDescription: 1442 categories
## wikidataID: 2983 categories
## suburb: 1588 categories
## postcode: 911 categories
## wikipediaURL: 2983 categories
## honoursURL: 3072 categories
## citation: 2813 categories
## announcement: 94 categories
## Warning in cor(x = structure(list(daysDiff = c(8955, 9103, 8392, 9061, 9939, : the standard deviation is zero
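
The "standard deviation is zero" warning arises because Pearson's correlation divides by the product of the two standard deviations, so it is undefined for any column that takes a single value. The sketch below illustrates this in Python. It is not the report's own code: the `pearson` function and the `constant_indicator` column are hypothetical, and only the first five `daysDiff` values echoed in the warning are used.

```python
import math


def pearson(x, y):
    """Return Pearson's r, or None when either input has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # The denominator would be zero, so the correlation is undefined.
        return None
    return cov / math.sqrt(var_x * var_y)


# First five daysDiff values as echoed in the warning; the rest are elided there.
days_diff = [8955, 9103, 8392, 9061, 9939]

# Hypothetical indicator column that is constant across all rows.
constant_indicator = [1, 1, 1, 1, 1]

print(pearson(days_diff, constant_indicator))  # undefined, so None
print(pearson(days_diff, days_diff))           # a column against itself
```

Returning `None` here is a design choice for the sketch. It makes the undefined case explicit instead of raising a division error.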

Principal Component Analysis

## 15 features with more than 50 categories ignored!
## dateAwarded: 109 categories
## honourDate: 98 categories
## pageCreation: 2043 categories
## fullName: 3057 categories
## name: 2983 categories
## name2: 2970 categories
## personLabel: 2970 categories
## personDescription: 1442 categories
## wikidataID: 2983 categories
## suburb: 1588 categories
## postcode: 911 categories
## wikipediaURL: 2983 categories
## honoursURL: 3072 categories
## citation: 2813 categories
## announcement: 94 categories
## Warning in plot_prcomp(data = structure(list(wikipediaPage = c("Yes", "Yes", : The following features are dropped due to zero variance:
##  * wikipediaPage_Yes
##  * wikidataEntry_Yes